DETAILED ACTION

Continued Examination Under 37 CFR 1.114

1. A request for continued examination under 37 CFR 1.114, including the fee set
forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this 
application is eligible for continued examination under 37 CFR 1.114, and the fee set 
forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action 
has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 09 
September 2005 has been entered. 

2. Claims 1-3, 6-14 and 18-21 are pending in this application. Claims 1, 8 and 13
are independent claims. In the Amendment filed with the RCE of 09 September 2005,
claims 1, 8, 13, 18 and 19 were amended; claims 4-5 and 15-17 were cancelled; and
claims 20-21 were added. This action is non-final. 

Claim Rejections - 35 USC § 103 

The text of those sections of Title 35, U.S. Code not included in this action can 
be found in a prior Office action. 

3. Claims 1, 13, 20 and 21 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over U.S. Patent No. 5,515,490 issued to Buchanan et al. (hereafter 
Buchanan '490) in view of U.S. Patent No. 5,649,060 issued to Ellozy et al. (hereafter 
Ellozy '060), and further in view of the publication, "Cooperative Use of MHEG-5 and 
HyTime", by Rutledge et al., published by Proceedings of Hypertext and Hypermedia,
1997 (hereafter Rutledge '97). 
Claim 1: 

Regarding Claim 1, Buchanan '490 discloses an automatic temporal formatter for
synchronizing multimedia data streams such as video, audio, and text (e.g. subtitles).
Specifically, Buchanan '490 discloses: a computer-based method of synchronizing a
realization of a media stream (Buchanan '490: Abstract) having at least one version of
content and having a first representation synchronized with said realization, and at least
one second representation (Buchanan '490: col. 57, lns. 11-13), said method
comprising:

- determining structure information for said first representation and said at least
one second representation (Buchanan '490: col. 23, lns. 59-65; col. 57, lns. 20-30);

- determining structure association information between said first representation
and said at least one second representation (Buchanan '490: col. 23, ln. 66 to
col. 24, ln. 10; col. 57, lns. 31-50);

- synchronizing said at least one second representation with said first
synchronized representation and said realization using said structure association
information (Buchanan '490: col. 24, lns. 11-15; col. 57, lns. 51-63; col. 58, lns.
9-23); and

- aligning said at least one version of content with said first representation to
produce linked relationships between a structural view of said at least one
version of content and said first representation (Buchanan '490: col. 24, lns. 11-15;
col. 57, lns. 51-63; col. 58, lns. 9-23 - Note: schedule commands within data
structure link events from content and representations).

However, Buchanan '490 does not explicitly disclose: wherein said structure 
association information includes semantic structure association information; and 
wherein said aligning produces "a web of relations" as claimed. 

Ellozy '060 discloses: wherein said structure association information includes
semantic structure association information (Ellozy '060: col. 9, ln. 64 to col. 10, ln. 15 -
note that this information is used to align an audio representation and a text
representation strictly using word content, i.e. semantic structure information, rather than
temporal information; see Ellozy '060: col. 3, ln. 31 to col. 4, ln. 48).

Rutledge '97 discloses MHEG-5 and HyTime (Hypermedia/Time-based 
Structuring Language): producing a web of relations (Rutledge '97: Section 2, titled 
"Standards for Hypermedia", second paragraph). It is noted that applicants' 
specification describes "producing a web of relations" as creating a HyTime document 
to realize the structural links (relations). 

It would have been obvious to a person having ordinary skill in the art to augment 
the temporal alignment means of Buchanan '490 with the semantic structural alignment 
means of Ellozy '060, and further to apply the HyTime language of Rutledge '97 to 
realize the structural links (web of relations) produced by Buchanan '490 to obtain the 
invention as claimed. The motivation to combine is suggested by Ellozy '060 which
discloses that use of the means of Ellozy '060 expands the audio-video data that
Buchanan '490 may operate on by providing support for audio-video data that are not
time correlated (Ellozy '060: col. 1, lns. 64-67); and Rutledge '97 which discloses:
HyTime especially in cooperation with MHEG-5 provides a particularly advantageous
combination for the encoding of hypermedia (and multimedia) presentations (Rutledge
'97: Abstract).

Application/Control Number: 09/994,544
Art Unit: 2161

Claim 13: 

Examiner notes that Claim 13 is the apparatus embodiment of Claim 1 and is 
rejected on the same basis. 
Claims 20 and 21: 

Regarding Claims 20 and 21, Buchanan '490, Ellozy '060 and Rutledge '97 in 
combination disclose the method of claim 1 and the storage of claim 13, as above, 
wherein the step of synchronizing said at least one second representation with said first 
synchronized representation and said realization is done using only said semantic 
structure association information (Ellozy '060: col. 9, ln. 60 to col. 10, ln. 15 - note that
this information is used to align an audio representation and a text representation 
(summary transcript) strictly using word content i.e. semantic structure information, 
rather than temporal information) as claimed. 

4. Claims 2-3 and 14 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Buchanan '490 in view of Ellozy '060 and Rutledge '97 as applied to claims 1 and 
13 above, and further in view of the publication, "Synchronization Relation Tree: A
model for Temporal Synchronization in Multimedia Presentation", by Kim et al.,
published as Technical Report TR92-42, by the Dept. of Computer Science, Univ. of
Minnesota, 1992 (hereafter Kim '92). 
Claim 2: 

Regarding Claim 2, Buchanan '490, Ellozy '060 and Rutledge '97 in combination 
disclose all the limitations of Claim 1 (supra). Additionally, Buchanan '490, Ellozy '060 
and Rutledge '97 in combination disclose: said step of determining structure information
further comprising: analyzing said structure information of said first and said at least one
second representation (Buchanan '490: col. 23, lns. 59-65; col. 57, lns. 20-30).
Furthermore, Buchanan '490, Ellozy '060 and Rutledge '97 in combination disclose
providing a stream of temporal data (Buchanan '490: col. 23, lns. 59-65; col. 3, lns. 40-47;
note that data provided continuously over runtime reads on a stream). However,
Buchanan '490, Ellozy '060 and Rutledge '97 in combination do not explicitly disclose: 
the stream of temporal data comprised of tree locators. 

Kim '92 discloses a synchronization relation tree (Kim '92: Abstract). (Note that 
a data structure that contains pointers to data corresponding to the nodes rather than 
the data itself reads on tree locators). 

It would have been obvious to a person having ordinary skill in the art to apply 
the synchronization relation tree of Kim '92 to the automatic formatter of Buchanan '490, 
Ellozy '060 and Rutledge '97 in combination. The motivation to combine is suggested 
by Kim '92 which discloses: the synchronization relation tree provides for both 
"temporal relationship consistency" and "dynamic schedule completion" and further is 
better suited for an object-oriented implementation (Kim '92: p. 3, ln. 38 to p. 4, ln. 3).
Claim 3:

Regarding Claim 3, Buchanan '490, Ellozy '060, Rutledge '97 and Kim '92 in 
combination disclose all the limitations of Claim 2 (supra). Further note that Buchanan 
'490 and Kim '92 in combination disclose: aligning said determined structure information 
of said first representation and said at least one second representation (Buchanan '490: 
col. 24, lns. 11-15; col. 57, lns. 51-63; col. 58, lns. 9-23) using said semantic structure
association information in a form lacking temporal information (Ellozy '060: col. 3, ln. 31
to col. 4, ln. 48; col. 1, lns. 64-67).
Claim 14: 

Examiner notes that Claim 14 is the apparatus embodiment of Claim 2 and is 
rejected on the same basis. 

5. Claims 6 and 18 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Buchanan '490, Ellozy '060, Kim '92, and Rutledge '97 in view of U.S. Patent No. 
5,731,847 issued to Tsukagoshi et al. (hereafter Tsukagoshi '847), in further view of 
U.S. Patent No. 5,794,197 issued to Alleva et al. (hereafter Alleva '197), and moreover in
view of the publication, "Using the Strategy Design Pattern to Compose Reliable
Distributed Protocols", by Garbinato et al., published by the USENIX Conference on
Object-Oriented Technologies and Systems, 1997 (hereafter Garbinato '97).

Regarding Claim 6, Buchanan '490, Ellozy '060, Kim '92, and Rutledge '97 in 
combination disclose all the limitations of Claim 3 (supra). Further note that Buchanan 
'490, Ellozy '060, Kim '92, and Rutledge '97 in combination disclose: aligning media 
streams (Buchanan '490: col. 24, lns. 11-15; col. 57, lns. 51-63; col. 58, lns. 9-23).
However, Buchanan '490, Ellozy '060, Kim '92, and Rutledge '97 in combination do not 
explicitly disclose: aligning an audio stream specified by said media stream with an 
audio structure corresponding to said audio stream. 

Tsukagoshi '847 discloses an encoder and decoder of subtitle information. 
Specifically, Tsukagoshi '847 discloses: aligning an audio stream specified by said 
media stream (Tsukagoshi '847: col. 11, lns. 45-50). However, Tsukagoshi '847 does
not explicitly disclose: aligning with an audio structure corresponding to said audio 
stream. 

Alleva '197 discloses a specific alignment of an audio structure from an audio 
stream (Alleva '197: col. 13, lns. 40-46). Note that while analysis of an audio stream
under Tsukagoshi '847 is optional, the combination of Alleva '197 to Tsukagoshi '847
requires the generation and subsequent alignment of an audio structure from an audio
stream. However, Alleva '197 does not explicitly disclose: aligning with an audio
structure corresponding to said audio stream.

Garbinato '97 discloses the well-known Strategy design pattern. Specifically, 
Garbinato '97 discloses that objects designed to handle distinct types of data and/or
interactions are to be distinct via the Strategy design pattern (Garbinato '97: p. 1, col. 2,
lns. 14-27).

It would have been obvious to a person having ordinary skill in the art to augment
the automatic formatter of Buchanan '490, Ellozy '060, Kim '92, and Rutledge '97 in
combination with the rate controller and encoder/decoder of Tsukagoshi '847. The
motivation to combine is suggested by Buchanan '490 which discloses: the automatic
formatter of Buchanan '490, Ellozy '060, Kim '92, and Rutledge '97 operates during
run-time (Buchanan '490: col. 3, lns. 11-15) and further that application of the automatic
formatter of Buchanan '490, Ellozy '060, Kim '92, and Rutledge '97 combined with
Garbinato '97 provides the advantage of handling unpredictable data changes such as
that of the runtime subtitle to video/audio matching of Tsukagoshi '847 (Buchanan '490:
col. 3, lns. 40-47; col. 6, lns. 7-10).

It would have been further obvious to a person having ordinary skill in the art to 
modify the Buchanan '490, Ellozy '060, Kim '92, Rutledge '97, and Tsukagoshi '847 
combination to Alleva '197. The motivation to combine is suggested by Alleva '197 
which discloses that utilization of the invention of Alleva '197 provides a particularly 
advantageous means to model speech and audio, such as that of the subtitle 
information of Buchanan '490, Ellozy '060, Kim '92, Rutledge '97, and Tsukagoshi '847 
(Alleva '197: col. 2, lns. 49-63).

It would have been moreover obvious to a person having ordinary skill in the art 
to modify the Buchanan '490, Ellozy '060, Kim '92, Rutledge '97, Tsukagoshi '847, and 
Alleva '197 combination by separating the structuring functions of the first and second 
operations into distinct aligner objects as per the Strategy design pattern of Garbinato 
'97. The motivation to accomplish said modification is suggested by Garbinato '97 
which discloses that encapsulating the aligner implementations into separate objects 
and invoking via a Strategy design pattern provides the advantages of providing both 
design time and runtime composition of aligner implementations and furthermore 
overcomes the limitations of an inheritance based implementation (Garbinato '97: p. 3,
col. 2, ln. 3 to p. 4, col. 1, ln. 24).
Claim 18: 

Examiner notes that Claim 18 is the apparatus embodiment of Claim 6 and is 
rejected on the same basis. 

6. Claims 7 and 19 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Buchanan '490, Ellozy '060, Rutledge '97 and Kim '92 in view of U.S. Patent No. 
5,731,847 issued to Tsukagoshi et al. (hereafter Tsukagoshi '847), in further view of the
publication, "Using the Strategy Design Pattern to Compose Reliable Distributed 
Protocols", by Garbinato et al. published by the USENIX Conference on Object-Oriented 
Technologies and Systems, 1997 (hereafter Garbinato '97). 
Claim 7: 

Regarding Claim 7, Buchanan '490, Ellozy '060, Rutledge '97 and Kim '92 in 
combination disclose all the limitations of Claim 3 (supra). Further note that Buchanan 
'490, Ellozy '060, Rutledge '97 and Kim '92 in combination disclose: aligning media 
streams (Buchanan '490: col. 24, lns. 11-15; col. 57, lns. 51-63; col. 58, lns. 9-23).
However, Buchanan '490, Ellozy '060, Rutledge '97 and Kim '92 in combination do not 
explicitly disclose: aligning a text stream specified by said media stream with a text 
structure corresponding to said text stream. 

Tsukagoshi '847 discloses an encoder and decoder of subtitle information. 
Specifically, Tsukagoshi '847 discloses: aligning a text stream specified by said media 
stream (Tsukagoshi '847: col. 11, lns. 28-35). However, Tsukagoshi '847 does not
explicitly disclose: aligning with a text structure corresponding to said text stream. 

Garbinato '97 discloses the well-known Strategy design pattern. Specifically, 
Garbinato '97 discloses that objects designed to handle distinct types of data and/or
interactions are to be distinct via the Strategy design pattern (Garbinato '97: p. 1, col. 2,
lns. 14-27).

It would have been obvious to a person having ordinary skill in the art to apply the
automatic formatter of Buchanan '490, Ellozy '060, Rutledge '97 and Kim '92 for the rate
controller with the encoder/decoder of Tsukagoshi '847. The motivation to combine is
suggested by Buchanan '490 which discloses: the automatic formatter of Buchanan '490,
Ellozy '060, Rutledge '97 and Kim '92 operates during run-time (Buchanan '490: col. 3,
lns. 11-15) and further that application of the automatic formatter of Buchanan '490,
Ellozy '060, and Kim '92 with Garbinato '97 provides the advantage of handling
unpredictable data changes such as that of the runtime subtitle to video/audio matching
of Tsukagoshi '847 (Buchanan '490: col. 3, lns. 40-47; col. 6, lns. 7-10).

It would have been further obvious to a person having ordinary skill in the art to 
modify the Buchanan '490, Ellozy '060, Rutledge '97, Kim '92, and Tsukagoshi '847 
combination by separating the structuring functions of the first and second operations 
into distinct aligner objects as per the Strategy design pattern of Garbinato '97. The 
motivation to accomplish said modification is suggested by Garbinato '97 which 
discloses that encapsulating the aligner implementations into separate objects and 
invoking via a Strategy design pattern provides the advantages of providing both design 
time and runtime composition of aligner implementations and furthermore overcomes
the limitations of an inheritance based implementation (Garbinato '97: p. 3, col. 2, ln. 3
to p. 4, col. 1, ln. 24).
Claim 19: 

Examiner notes that Claim 19 is the apparatus embodiment of Claim 7 and is 
rejected on the same basis. 

7. Claim 8 is rejected under 35 U.S.C. 103(a) as being unpatentable over Buchanan 
'490 in view of Ellozy '060 and Rutledge '97 as applied to claim 1 above, and further in 
view of Garbinato '97. 
Claim 8: 

Regarding Claim 8, Buchanan '490 discloses: a system for synchronizing a 
realization of a media stream (Buchanan '490: Abstract) having at least one version of 
content and having a first representation synchronized with said realization, and at least 
one second representation (Buchanan '490: col. 57, lns. 11-13), said method
comprising: 

- a structurer configured to determine structure information for said first 
representation (Buchanan '490: col. 23, lns. 59-65; col. 57, lns. 20-30);

- a structurer configured to determine structure information for said at least one
second representation (Buchanan '490: col. 23, lns. 59-65; col. 57, lns. 20-30);
and

- a first aligner configured to align said structure information for said first
representation and said at least one second representation (Buchanan '490: col.
23, ln. 66 to col. 24, ln. 10; col. 57, lns. 31-50);

- wherein said first aligner produces linked relationships between a structural view 
of said at least one version of content and said first representation (Buchanan 
'490: col. 24, lns. 11-15; col. 57, lns. 51-63; col. 58, lns. 9-23 - Note: schedule
commands within data structure link events from content and representations). 

However, Buchanan '490 does not explicitly disclose: 

- wherein said first aligner aligns in part at least a semantic structure association 
information lacking temporal data forming a portion of said structure information 
for said first representation and said at least one second representation; 

- that the structurer for the first representation and the structurer for the second 
representation are distinct; and wherein said aligning produces "a web of 
relations" as claimed. 

Ellozy '060 discloses: 

- wherein said first aligner aligns in part at least a semantic structure association 
information lacking temporal data forming a portion of said structure information 
for said first representation and said at least one second representation (Ellozy 
'060: col. 3, ln. 31 to col. 4, ln. 48; col. 1, lns. 64-67);

However, Ellozy '060 does not explicitly disclose that the structurer for the first
representation and the structurer for the second representation are distinct.

Garbinato '97 discloses the well-known Strategy design pattern. Specifically,
Garbinato '97 discloses that objects designed to handle distinct types of data and/or
interactions are to be distinct via the Strategy design pattern (Garbinato '97: p. 1, col. 2,
lns. 14-27).

Rutledge '97 discloses MHEG-5 and HyTime (Hypermedia/Time-based 
Structuring Language): producing a web of relations (Rutledge '97: Section 2, titled 
"Standards for Hypermedia", second paragraph). It is noted that applicants' 
specification describes "producing a web of relations" as creating a HyTime document 
to realize the structural links (relations). 

It would have been obvious to a person having ordinary skill in the art to augment 
the temporal alignment means of Buchanan '490 with the semantic structural alignment 
means of Ellozy '060, and further to apply the HyTime language of Rutledge '97 to 
realize the structural links (web of relations) produced by Buchanan '490. The 
motivation to combine is on the same basis as Claim 1 (supra). 

It would have been further obvious to a person having ordinary skill in the art to 
modify Buchanan '490, Ellozy '060 and Rutledge '97 by separating the structuring 
functions of the first and second operations into distinct structurer objects as per the 
Strategy design pattern of Garbinato '97. The motivation to accomplish said 
modification is suggested by Garbinato '97 which discloses that encapsulating the 
structurer implementations into separate objects and invoking via a Strategy design 
pattern provides the advantages of providing both design time and runtime composition 
of structurer implementations and furthermore overcomes the limitations of an 
inheritance based implementation (Garbinato '97: p. 3, col. 2, ln. 3 to p. 4, col. 1, ln.
24).

8. Claim 9 is rejected under 35 U.S.C. 103(a) as being unpatentable over Buchanan 
'490, Ellozy '060, Rutledge '97 and Garbinato '97 in view of Tsukagoshi '847. 
Claim 9: 

Regarding Claim 9, Buchanan '490, Ellozy '060, Rutledge '97 and Garbinato '97 
in combination disclose all the limitations of Claim 8 (supra). However, Buchanan '490, 
Ellozy '060, Rutledge '97 and Garbinato '97 in combination do not disclose: at least one 
renderer configured to render said at least one second representation, after being 
synchronized, in a form suitable for displaying as an overlayed subtitle. 

Tsukagoshi '847 discloses an encoder and decoder of subtitle information. 
Specifically, Tsukagoshi '847 discloses: at least one renderer configured to render said
at least one second representation, after being synchronized, in a form suitable for
displaying as an overlayed subtitle (Tsukagoshi '847: col. 16, lns. 1-15). Note that
Tsukagoshi '847 teaches "a rate controller which controls the rate that the compressed
video is transferred to the multiplexer in synchronism with the rate that the subtitles are
sent to the multiplexer" (Tsukagoshi '847: col. 11, lns. 37-43).

It would have been obvious to a person having ordinary skill in the art to augment
the automatic formatter of Buchanan '490, Ellozy '060, Rutledge '97 and Garbinato '97
with the rate controller and encoder/decoder of Tsukagoshi '847. The motivation to
combine is suggested by Buchanan '490 which discloses: the automatic formatter of
Buchanan '490, Ellozy '060, Rutledge '97 and Garbinato '97 operates during run-time
(Buchanan '490: col. 3, lns. 11-15) and further that application of the automatic formatter
of Buchanan '490, Ellozy '060, and Garbinato '97 provides the advantage of handling
unpredictable data changes such as that of the runtime subtitle to video/audio matching
of Tsukagoshi '847 (Buchanan '490: col. 3, lns. 40-47; col. 6, lns. 7-10).

9. Claim 10 is rejected under 35 U.S.C. 103(a) as being unpatentable over 
Buchanan '490, Ellozy '060, Rutledge '97, Garbinato '97, and Tsukagoshi '847 in view 
of Kim '92. 
Claim 10: 

Regarding Claim 10, Buchanan '490, Ellozy '060, Rutledge '97, Garbinato '97, 
and Tsukagoshi '847 in combination disclose all the limitations of Claim 9 (supra). 
Buchanan '490, Ellozy '060, Rutledge '97, Garbinato '97, and Tsukagoshi '847 further 
disclose that the realization specifies a media stream (Buchanan '490: col. 57, lns. 11-13).
However, Buchanan '490, Ellozy '060, Rutledge '97, Garbinato '97, and
Tsukagoshi '847 in combination do not explicitly disclose: system further comprising: a 
tree aligner configured to determine a tree structure for said media stream. 

Kim '92 discloses a synchronization relation tree. Specifically, Kim '92 discloses: 
the system further comprising: a tree aligner configured to determine a tree structure for 
said media stream (Kim '92: Abstract). 

It would have been obvious to a person having ordinary skill in the art to apply 
the synchronization relation tree of Kim '92 to the Buchanan '490, Ellozy '060, Rutledge 
'97, Garbinato '97, and Tsukagoshi '847 in combination. The motivation to combine is
suggested by Kim '92 which discloses the synchronization relation tree provides for both
"temporal relationship consistency" and "dynamic schedule completion" and further is
better suited for an object-oriented implementation (Kim '92: p. 3, ln. 38 to p. 4, ln. 3).

10. Claims 11-12 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Buchanan '490, Ellozy '060, Rutledge '97, Garbinato '97, Tsukagoshi '847, and Kim '92 
in combination in further view of the publication, "Detection of Target Speakers in Audio 
Databases," by Magrin-Chagnolleau, published by ICASSP, 1999 (hereafter Magrin- 
Chagnolleau '99). 
Claims 11-12: 

Regarding Claims 11-12, Buchanan '490, Ellozy '060, Rutledge '97, Garbinato 
'97, Tsukagoshi '847, and Kim '92 in combination disclose all the limitations of Claim 10 
(supra). However, Buchanan '490, Ellozy '060, Rutledge '97, Garbinato '97, Tsukagoshi 
'847, and Kim '92 in combination do not explicitly disclose: 

- (Claim 11) means for detecting speech and non-speech boundaries; and

- (Claim 12) means for detecting transitions and speaker changes.

Magrin-Chagnolleau '99 discloses: means for detecting speech and non-speech
boundaries and means for detecting transitions and speaker changes (Magrin-
Chagnolleau '99: Abstract; Section 4 titled, "Detection Algorithm").

It would have been obvious to a person having ordinary skill in the art to apply 
the means of Magrin-Chagnolleau '99 to the Buchanan '490, Ellozy '060, Rutledge '97, 
Garbinato '97, Tsukagoshi '847, and Kim '92 combination. The motivation to
accomplish said application is suggested by Magrin-Chagnolleau '99 which discloses
the advantages of automatically detecting "useful cues to segment, classify, and 
organize" audio data using multiple speakers (Magrin-Chagnolleau '99: Abstract, 
Section 1, titled, "Introduction."). 



Response to Arguments 

11. Applicants' arguments filed 09 September 2005 have been fully considered but
they are not persuasive. 

Referring to applicants' remarks on pages 8-9 regarding the Section 103
rejections over Buchanan in view of Ellozy: Applicants argued that both Buchanan and 
Ellozy use temporal information to align one representation with another representation. 

The examiner disagrees for the following reasons: Applicants' arguments 
regarding Ellozy's Timer 16, Timer Alignment module 42, and time aligner 106 (in 
Figures 2 and 3) are not germane to the question at hand. Specifically, these elements 
in Ellozy are involved in aligning Decoded Text (machine-translated text from 
audio/video) with the audio or video timestamps (i.e. to temporally align the decoded 
text with the audio/video itself - creating the "first synchronized representation" of the 
claims). Although the Decoded Text is one of the two versions (an actual text transcript 
is the other), timing/temporal information is NOT used to synchronize the two versions 
together. As described in col. 9, ln. 60 to col. 10, ln. 15 of Ellozy, and further shown in
Fig. 5, synchronization of the two versions (the Decoded Text and the Index (transcript)
Text) is performed based SOLELY on the semantic structure information, because
temporal information is either not available or is inaccurate. Time is simply shown as a
reference in Fig. 5, to note that the Decoded Text is time-aligned with the actual
audio/video itself.[1] However, note that the Index (transcript) Text does not directly time-
match the decoded text (i.e. DT4 is matched to T5 and DT5 is matched to T7). This
contradicts applicants' arguments altogether. 

Thus, Ellozy (within the combination) does teach "synchronizing said at least one 
second representation [Index (transcript) Text] with said first synchronized 
representation [Decoded Text] and said realization [actual audio/video] using said 
semantic structure association information" (Claim 1); aligning said determined structure
information of said first representation and said at least one second representation 
using said semantic structure association information in a form lacking temporal 
information (Claim 3); and wherein the step of synchronizing... is done using only said 
semantic structure association information (claim 20) as claimed. 

Referring to applicants' remarks on pages 9-11 regarding the prior element of
claim 4 (now incorporated in claims 1, 8 and 13): Applicants argued that the 
combination fails to teach or suggest producing a web of relations between a structural 
view of at least one version of content and said first representation as claimed. 

The examiner disagrees for the following reasons: Both Buchanan and Ellozy 
create links (relations) between a structural view of the at least one version of content 
and said first representation. That is, they each (and the combination as a whole)
create a data structure having links between a structural view of the at least one version
of content and said first representation (Buchanan: col. 24, lns. 11-15; col. 57, lns. 51-63;
col. 58, lns. 9-23 - Note: schedule commands within data structure link events from
content and representations; Ellozy: See Abstract). Applicants' instant specification
describes the creation of "a web of relations" between the structural view and the first
representation as the realization of such a data structure in a HyTime document. Thus,
the only difference between the combination of Buchanan and Ellozy and the invention
of claim 1 is the creation of "a web of relations." Rutledge discloses realizations of data
structure links within HyTime, among other things. Since applicants' specification
describes the "web of relations" as a HyTime realization, the combination of Buchanan,
Ellozy and Rutledge does teach the claimed web of relations by realizing
Buchanan's/Ellozy's links in a HyTime document.

The remainder of applicants' remarks substantially repeat one or more of the
arguments addressed above. Therefore, the examiner responds in kind.

[1] It is noted that any synchronization is necessarily 'temporal' to some degree by the very
definition/nature of synchronization.

Conclusion

12. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Brian Goddard whose telephone number is 571-272-4020.
The examiner can normally be reached on M-F, 9 AM - 5 PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Safet Metjahic, can be reached on 571-272-4023. The fax phone number for
the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 

bdg 

09 December 2005 